The Generalized Web Surfer
Abstract
Several models have been proposed for improving Web search results by taking the link structure of the Web into account. The PageRank algorithm models a random surfer who alternates between jumping to a random new page and following one of the current page's outlinks, each chosen with equal probability. We propose to improve on PageRank with an intelligent surfer that combines link structure and content to decide on its next transition. The intelligent surfer is guided by the textual authority of a web page. Textual authority gives a non-topical estimate of the intrinsic quality of a page and, when combined with link-based social authority, yields a more complete and robust estimate of a document's authoritativeness. Experiments on a number of queries indicate that our algorithm significantly outperforms PageRank algorithms in the human-rated quality of the pages returned, while retaining the efficiency and topic independence of PageRank.
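The contrast between the random surfer and a content-guided surfer can be illustrated with a small power-iteration sketch. This is not the paper's algorithm: the abstract does not give the transition formula, so the sketch assumes that the probability of following a link is proportional to a textual-authority score of the target page. The function name `surfer_rank`, the `authority` scores, and the damping value 0.85 are all hypothetical choices made for illustration.

```python
def surfer_rank(out_links, authority=None, damping=0.85, iters=100):
    """Power iteration for a uniform or authority-biased surfer.

    out_links: dict mapping every page to the list of pages it links to
               (every link target must also be a key of the dict).
    authority: optional dict of positive per-page scores (an assumed
               stand-in for textual authority). None gives the uniform
               random surfer of plain PageRank.
    """
    pages = list(out_links)
    n = len(pages)
    if authority is None:
        authority = {p: 1.0 for p in pages}

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Random jump: with probability (1 - damping) land on any page.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = out_links[p]
            if not targets:
                # Dangling page: spread its rank uniformly over all pages.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
                continue
            # Assumption: link choice is weighted by the target's authority.
            weight_sum = sum(authority[q] for q in targets)
            for q in targets:
                new_rank[q] += damping * rank[p] * authority[q] / weight_sum
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    print(surfer_rank(graph))                              # uniform surfer
    print(surfer_rank(graph, {"a": 1.0, "b": 3.0, "c": 1.0}))  # biased surfer
```

With uniform scores the function reduces to ordinary PageRank, which makes it easy to compare the two surfers on the same graph. Because the scores do not depend on the query, the ranking can still be computed once offline, in line with the topic independence the abstract claims.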
Similar resources
Issues of Learning the Browsing Language
The web is pervading all walks of life and its huge increase in information volume has made the web personalization mandatory. Web Personalization may be achieved by web mining especially the web usage mining technique on the surfing behavior. Learning the surfing behavioral pattern has emerged into a promising research area to achieve web personalization. Till recently web usage mining was don...
Kemeny's Constant and the Random Surfer
We revisit Kemeny’s constant in the context of Web navigation, also known as “surfing”. We derive bounds on the constant and give it a novel interpretation in terms of the number of links a random surfer will follow to reach his final destination.
3D-SURFER: software for high-throughput protein surface comparison and analysis
SUMMARY We present 3D-SURFER, a web-based tool designed to facilitate high-throughput comparison and characterization of proteins based on their surface shape. As each protein is effectively represented by a vector of 3D Zernike descriptors, comparison times for a query protein against the entire PDB take, on an average, only a couple of seconds. The web interface has been designed to be as int...
MFCRank: A Web Ranking Algorithm Based on Correlation of Multiple Features
This paper presents a new ranking algorithm MFCRank for topic-specific Web search systems. The basic idea is to correlate two types of similarity information into a unified link analysis model so that the rich content and link features in Web collections can be exploited efficiently to improve the ranking performance. First, a new surfer model JBC is proposed, under which the topic similarity i...
The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank
The PageRank algorithm, used in the Google search engine, greatly improves the results of Web search by taking into account the link structure of the Web. PageRank assigns to a page a score proportional to the number of times a random surfer would visit that page, if it surfed indefinitely from page to page, following all outlinks from a page with equal probability. We propose to improve PageRa...